Installing Python Packages Workshop

The University of Houston has many resources for students and staff to use, including the High Performance Computing platforms of the Hewlett Packard Enterprise Data Science Institute (HPE DSI): Carya, Opuntia and Sabine.

May 17, 2023

Isabelle Sitchon



When installing Python packages on these clusters, which are housed in the Research Computing Data Core (RCDC) and managed by the UH Research Computing Center (RCC), many users may run into job-specific errors or memory problems.

In an HPE DSI workshop, Usha Rani Nalla walks through the basic steps of installing Python packages on the HPC clusters using two main management systems: Conda Virtual Environment and PIP. Rani Nalla, who is currently pursuing her master’s degree in Computer and Information Sciences at UH, is a research computing facilitator and teaching assistant at the HPE DSI. She has more than five years of experience in Java/J2EE with Spring Boot applications and data management tools.

To start, Rani Nalla explains the fundamentals of a Conda environment, which is a directory that contains a specific collection of packages and can be switched according to the workload. She then demonstrates how to create and activate a Conda environment for the scikit-learn library, providing examples of the commands used in the process. She also shows how to use the scikit-learn virtual environment inside a batch job.
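As a rough illustration of checking that an environment is really the active one, the sketch below could be run interactively or as a step in a batch job. It is not taken from the workshop: the helper name `describe_environment` is hypothetical, and the check assumes that `conda activate` has set the `CONDA_DEFAULT_ENV` variable, which is Conda's usual behavior.

```python
import importlib.util
import os
import sys


def describe_environment(package="sklearn"):
    """Report which Python environment is active and whether `package` can be found.

    Hypothetical helper for illustration; `sklearn` is the import name of scikit-learn.
    """
    return {
        # Path of the interpreter running this script.
        "python": sys.executable,
        # Root directory of the environment that interpreter belongs to.
        "prefix": sys.prefix,
        # Set by `conda activate`; None when no Conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # True if the top-level package is on the import path (it is not imported).
        "package_found": importlib.util.find_spec(package) is not None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `conda_env` is empty or `package_found` is false inside a job, the environment was most likely not activated in the job script before Python started.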

In the second half of the workshop, Rani Nalla discusses how to install and use a Python package with PIP, a package manager for the Python programming language. Using scikit-learn to demonstrate the basic steps of installation and use, Rani Nalla notes that PIP, which installs and manages software packages that are not part of the standard Python library, is typically used in place of Conda packages and is best suited to easily updating existing ones.
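The installation step can be sketched from Python itself. The example below is an assumption-laden illustration, not the workshop's own commands: the helper names `pip_install_command`, `install` and `installed_version` are hypothetical, and whether the clusters call for the `--user` option is not stated in the article. The `pip install`, `--user` and `--upgrade` options themselves are standard.

```python
import subprocess
import sys
from importlib import metadata


def pip_install_command(package, user=False, upgrade=False):
    """Build the argument list for installing `package` with pip (hypothetical helper)."""
    # Running pip as a module ties the install to this exact interpreter.
    command = [sys.executable, "-m", "pip", "install"]
    if user:
        command.append("--user")     # install into the user's own site directory
    if upgrade:
        command.append("--upgrade")  # update a package that is already installed
    command.append(package)
    return command


def install(package, **options):
    """Run the install; raises CalledProcessError if pip exits with an error."""
    subprocess.check_call(pip_install_command(package, **options))


def installed_version(distribution):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Only print the command here; call install("scikit-learn") to actually run it.
    print(" ".join(pip_install_command("scikit-learn", upgrade=True)))
    print("currently installed:", installed_version("scikit-learn"))
```

Building the command separately from running it makes it easy to inspect what will be executed before anything is changed on a shared cluster.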

How does a user know which management system to use? There are key differences between the two tools, according to Rani Nalla. PIP is typically used in more basic Python projects, as it is easier to use. More complex projects are better suited to the Conda Virtual Environment, as users are able to install packages from multiple sources and manage different environments in a flexible manner.

To access the HPE DSI platforms, UH students and staff can request cluster access online. For more information about RCDC resources, please visit the RCDC website.

